CCS Resource Management in Networked HPC Systems

Authors

  • Axel Keller
  • Alexander Reinefeld
Abstract

CCS is a resource management system for parallel high-performance computers. At the user level, CCS provides vendor-independent access to parallel systems. At the system administrator level, CCS offers tools for controlling (i.e., specifying, configuring, and scheduling) the system components that are operated in a computing center; hence the name "Computing Center Software". CCS provides hardware-independent scheduling of interactive and batch jobs; partitioning of exclusive and non-exclusive resources; open, extensible interfaces to other resource management systems; a high degree of reliability (e.g., automatic restart of crashed daemons); and fault tolerance in the case of network breakdowns. In this paper, we describe CCS as one important component for the access, job distribution, and administration of networked HPC systems in a metacomputing environment.

Similar resources

A Comparison of Job Management Systems in Supporting HPC ClusterTools

This paper compares the three most common job management systems and how they work with Sun HPC ClusterTools 3.1. Various aspects, such as installation, customization, scheduling, and resource control, are discussed. The three chosen systems are Load Sharing Facility (LSF), Portable Batch System (PBS), and COmputing in DIstributed Networked Environment (CODINE)/Global Resource Director (GRD)....


Anatomy of a Resource Management System for HPC Clusters

Workstation clusters are often not only used for high-throughput computing in time-sharing mode but also for running complex parallel jobs in space-sharing mode. This poses several difficulties to the resource management system, which must be able to reserve computing resources for exclusive use and also to determine an optimal process mapping for a given system topology. On the basis of our CC...


Virtual Resource Management Based Meteorological Computational Grid

Meteorology is one of the main application areas of high-performance computing (HPC) technologies; efficient and accurate weather forecasting and meteorological services are impossible without the support of HPC resources. As a kind of networked HPC application environment, a meteorological computational Grid integrates diverse computing resources using Grid technology, pro...


Managing clusters of geographically distributed high-performance computers

We present a software system for the management of geographically distributed high-performance computers. It consists of three components: 1. The Computing Center Software (CCS) is a vendor-independent resource management software for local HPC systems. It controls the mapping and scheduling of interactive and batch jobs on massively parallel systems; 2. The Resource and Service Description (RSD...


Scheduling in HPC Resource Management Systems: Queuing vs. Planning

Nearly all existing HPC systems are operated by resource management systems based on the queuing approach. With the increasing acceptance of grid middleware like Globus, new requirements for the underlying local resource management systems arise. Features like advanced reservation or quality of service are needed to implement high level functions like co-allocation. However it is difficult to r...




Publication date: 1998